30 research outputs found

    Foundations of Multidimensional Network Analysis

    Complex networks have been receiving increasing attention from the scientific community, thanks also to the increasing availability of real-world network data. In recent years, the multidimensional nature of many real-world networks has been pointed out, i.e. many networks containing multiple connections between any pair of nodes have been analyzed. Although previous works recognized the importance of analyzing this kind of network, a complete framework for multidimensional network analysis is still missing. Such a framework would enable analysts to study different phenomena, which can be either the generalization to the multidimensional setting of what happens in monodimensional networks, or a new class of phenomena induced by the additional degree of complexity that multidimensionality provides in real networks. The aim of this paper is therefore to lay the basis for multidimensional network analysis: we develop a solid repertoire of basic concepts and analytical measures that takes into account the general structure of multidimensional networks. We tested our framework on a real-world multidimensional network, showing the validity and meaningfulness of the measures introduced, which are able to extract important, non-random information about complex phenomena.
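    To make the notion of "multiple connections between any pair of nodes" concrete, here is a minimal sketch of one possible representation: an undirected multigraph whose edges carry a dimension label. The class name, method names and the two measures shown (degree and neighbor set, overall or per dimension) are illustrative assumptions, not the definitions used in the paper.

```python
from collections import defaultdict


class MultidimensionalNetwork:
    """Hypothetical container: undirected edges labelled with a dimension."""

    def __init__(self):
        # adjacency[node][dimension] -> set of nodes reached in that dimension
        self.adjacency = defaultdict(lambda: defaultdict(set))

    def add_edge(self, u, v, dimension):
        # Repeating the same (u, v, dimension) triple has no effect,
        # because neighbours are stored in sets.
        self.adjacency[u][dimension].add(v)
        self.adjacency[v][dimension].add(u)

    def neighbors(self, node, dimension=None):
        """Distinct neighbours of node, in one dimension or in any."""
        by_dim = self.adjacency.get(node, {})
        if dimension is not None:
            return set(by_dim.get(dimension, set()))
        result = set()
        for nodes in by_dim.values():
            result |= nodes
        return result

    def degree(self, node, dimension=None):
        """Number of incident edges, in one dimension or summed over all."""
        by_dim = self.adjacency.get(node, {})
        if dimension is not None:
            return len(by_dim.get(dimension, set()))
        return sum(len(nodes) for nodes in by_dim.values())


g = MultidimensionalNetwork()
g.add_edge("a", "b", "friend")
g.add_edge("a", "b", "colleague")  # same pair, different dimension
g.add_edge("a", "c", "friend")
```

    In this toy network, node "a" has three edges but only two distinct neighbors, a difference that cannot arise in a monodimensional graph.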

    Predicting User Engagement in Twitter with Collaborative Ranking

    Collaborative Filtering (CF) is a core component of popular web-based services such as Amazon, YouTube, Netflix, and Twitter. Most applications use CF to recommend a small set of items to the user. For instance, YouTube presents a user with a list of the top-n videos she would likely watch next, based on her rating and viewing history. Current methods of CF evaluation have focused on assessing the quality of a predicted rating or the ranking performance for the top-n recommended items. However, restricting recommender system evaluation to these two aspects is rather limiting and neglects other dimensions that could better characterize a well-perceived recommendation. In this paper, instead of optimizing rating or top-n recommendation, we focus on the task of predicting which items generate the highest user engagement. In particular, we use Twitter as our testbed and cast the problem as a Collaborative Ranking task, where the rich features extracted from the metadata of the tweets complement the transaction information, which is limited to user ids, item ids, ratings and timestamps. We learn a scoring function that directly optimizes user engagement in terms of nDCG@10 on the predicted ranking. Experiments conducted on an extended version of the MovieTweetings dataset, released as part of the RecSys Challenge 2014, show the effectiveness of our approach. Comment: RecSysChallenge'14 at RecSys 2014, October 10, 2014, Foster City, CA, US
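    For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of nDCG@k. It assumes the common linear-gain formulation with a log2 rank discount; the abstract does not say which variant the authors or the challenge used, and the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain of the first k items (linear gain assumed)."""
    # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

    Here `relevances` would hold the observed engagement of each item, listed in the order the model ranked them. A perfectly ordered list scores 1.0, and a list with no engagement at all is scored 0.0 by convention in this sketch.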

    AllAboard: a system for exploring urban mobility and optimizing public transport using cellphone data

    The deep penetration of mobile phones offers cities the ability to opportunistically monitor citizens' interactions and use data-driven insights to better plan and manage services. In this context, transit operators can leverage pervasive mobile sensing to better match observed demand for travel with their service offerings. With large-scale data on mobility patterns, operators can move away from the costly and resource-intensive transportation planning processes prevalent in the West to a more data-centric view that places the instrumented user at the center of development. In this framework, using mobile phone data to perform transit analysis and optimization represents a new frontier with significant societal impact, especially in developing countries. Document type: Part of book or chapter of book

    SaferCity: a System for Detecting Incidents from Social Media

    This paper presents a system to identify and characterise public safety related incidents from social media, and to enrich the situational awareness that law enforcement entities have of potentially unreported activities happening in a city. The system is based on a new spatio-temporal clustering algorithm that is able to identify and characterise relevant incidents given even a small number of social media reports. We present a web-based application exposing the features of the system, and demonstrate its usefulness in detecting, from Twitter, public safety related incidents that occurred in New York City during the Occupy Wall Street protests.

    Can we assess mental health through social media and smart devices? Addressing bias in methodology and evaluation

    Predicting mental health from smartphone and social media data on a longitudinal basis has recently attracted great interest, with very promising results being reported across many studies. Such approaches have the potential to revolutionise mental health assessment, if their development and evaluation follow a real-world deployment setting. In this work we take a closer look at state-of-the-art approaches, using different mental health datasets and indicators, different feature sources and multiple simulations, in order to assess their ability to generalise. We demonstrate that, under a pragmatic evaluation framework, none of the approaches deliver or even approach the reported performances. In fact, we show that current state-of-the-art approaches can barely outperform the most naive baselines in the real-world setting, raising serious questions not only about their deployability, but also about the contribution of the derived features to the mental health assessment task and about how to make better use of such data in the future.

    The struggle for existence in the world market ecosystem

    The global trade system can be viewed as a dynamic ecosystem in which exporters struggle for resources: the markets to which they export. We can think of an exporter's aim as gaining the entirety of a market share (say, car imports to the United States). This is similar to the objective of an organism attempting to monopolize a given subset of resources in an ecosystem. In this paper, we adopt a multilayer network approach to describe this struggle. We use longitudinal, multiplex data on trade relations, spanning several decades. We connect two countries with a directed link if the source country's appearance in a market correlates with the target country's disappearance, where a market is defined as a country-product combination in a given decade. Each market is a layer in the network. We show that, by analyzing the countries' network roles in each layer, we are able to classify them as out-competing, transitioning or displaced. This classification is a meaningful one: when testing the future export patterns of these countries, we show that out-competing countries have distinctly stronger growth rates than the other two classes.
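    The link construction described in this abstract can be illustrated with a deliberately simplified sketch. The abstract speaks of a correlation between one country's appearance and another's disappearance; the code below replaces that with a plain rule (entrant points to leaver within the same market), which is an assumption made for illustration only. The function name, the input layout and the sample data are hypothetical.

```python
def displacement_layers(presence):
    """Build one directed edge list per market (layer).

    presence maps market -> {country: (present_at_start, present_at_end)}.
    Simplifying assumption: every country that enters a market is linked to
    every country that leaves it. The paper's correlation test is not modelled.
    """
    layers = {}
    for market, status in presence.items():
        entrants = sorted(c for c, (start, end) in status.items() if not start and end)
        leavers = sorted(c for c, (start, end) in status.items() if start and not end)
        layers[market] = [(src, dst) for src in entrants for dst in leavers]
    return layers


# Made-up toy data: one market, keyed as (importing country, product, decade).
toy_presence = {
    ("USA", "cars", "1990s"): {
        "KOR": (False, True),   # appears in the market
        "GBR": (True, False),   # disappears from the market
        "DEU": (True, True),    # stays throughout
    }
}
toy_layers = displacement_layers(toy_presence)
```

    In this toy layer the only link runs from "KOR" to "GBR", and "DEU" takes part in no link because it neither enters nor leaves.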

    Graph and network data: mining the temporal dimension

    In recent years, there have been many studies on analyzing network and graph data. A wide range of problems, such as studying the global and local properties of a graph, finding interesting structures, modeling particular characteristics, and assessing the properties of particular networks such as the Web or co-authorship networks, have attracted increasing attention from the scientific community, which seeks efficient and powerful techniques to achieve the desired results. For example, with the aim of finding interesting and frequent substructures in graphs, algorithms such as AGM, FSG, gSpan, Gaston, FFSMY, ADI-Mine, HSIGRAM and VSIGRAM have been presented, each improving the scalability of subgraph mining in turn. However, only in the last few years has attention moved to a particular aspect of graphs and networks: the temporal dimension. Thanks also to the wider availability of online social network services, the amount of data that allows the dynamics of complex networks to be analyzed has increased very quickly in the last five years. This kind of data contains rich information about what happens to a network over time, and enables analysts to model and discover interesting properties related to the temporal dimension, which would be both meaningless and impossible to study in the static setting. The temporal dimension can play a double role for a network. First, the underlying structure, namely the graph, can evolve over time, showing new users joining a community, new connections created among users, changes in the properties of a particular group of people, and so on. Second, given an established network, users may perform actions over time, leading to flows of information circulating along the connections, sequences of tasks performed by a sequence of users, the spread of influence across the network, and so on.
Despite the clear richness of the above setting, current graph mining techniques are too generic, and they do not explicitly take time into consideration during their stages. To overcome this problem, in this thesis we study the current graph mining algorithms, we study the possibility of pushing constraints into the computation so that the temporal dimension can be analyzed efficiently at the mining stage, and we develop new techniques that can help in this kind of analysis. To prove the effectiveness of our approach, we apply a pre-existing graph miner, a modified version of it specialized to deal with the temporal dimension, and another pre-existing analysis tool, namely the Temporally Annotated Sequences framework, to real data. We show how to deal with the above setting, with particular focus on problems such as mining the information propagation in a network, mining graph evolution rules, and mining the temporal dimension of process logs to derive the actual workflow diagram of a process. Our results justify the need for this approach, and show that specialized techniques help in modeling and analyzing temporal graph and network data.

    Pattern Discovery on Genotypes for the Inference of Haplotypes Responsible for Liver Diseases

    The large influx of data from biology opens the way to new applications of high scientific and social value. Applying suitable Pattern Discovery techniques to large genetic, clinical and demographic databases can bring to light unknown correlations that help biologists and physicians understand a disease. This thesis, carried out in collaboration with the Immunohematology Unit (U.O. di Immunoematologia) of the Azienda Ospedaliero-Universitaria Pisana, falls within this context. Starting from the analysis of a database of patients with various types of cirrhosis, it confirmed already known associations between DNA and the type of disease developed, and discovered new associations that are currently being evaluated by an expert in the field of biology. It was natural to ask whether the discovered genotype patterns constitute fragments of haplotypes, since that would be a more interesting biological scenario; the haplotype inference problem was therefore studied, in order to arrive at a possible solution based on Pattern Discovery algorithms.